Measures of Specificity Used in the Principle of Justifiable Granularity: A Theoretical Explanation of Empirically Optimal Selections
Authors
Abstract
To process huge amounts of data, one possibility is to combine some data points into granules, and then process the resulting granules. For each group of data points, if we try to include all data points in a granule, the resulting granule often becomes too wide and thus rather useless; on the other hand, if the granule is too narrow, it includes only a few of the corresponding points and is thus also rather useless. The need for a trade-off between coverage and specificity is formalized as the principle of justifiable granularity. The specific form of this principle depends on the selection of a measure of specificity. Empirical analysis has shown that exponential and power-law measures of specificity are the most adequate. In this paper, we show that natural symmetries explain this empirically observed efficiency.

I. FORMULATION OF THE PROBLEM

Granular computing: a brief reminder. In many practical situations, it is difficult to deal with the whole amount of data:
• it may be that we have too much data, so it is not feasible to apply the usual data processing algorithms to the data as a whole; this is the situation known as big data; see, e.g., [9];
• it may be that while in principle it is possible to eventually process all the data points, this would take longer than the time we have – e.g., when we need to make a decision right away;
• it may also be that we want to use our intuition to better process the data, and to use our intuition, we need to present the data in an understandable form.

There may be other cases when we have too much data. To deal with such cases, a natural idea is to compress the original data into a smaller set. The overall amount of available data can be estimated by multiplying the overall number of data points by the average number of bits in each data point. In general, each data point does not carry too much information, so the main way to decrease the overall amount of information is to decrease the number of data points.

Of course, we could simply take a sample from the original data set, but that would deprive us of all the information provided by the unused data points. A much better idea is to have each new "data point" correspond to several original ones. Such a "combined" data point is known as a granule, and the resulting technique is known as granular computing. The general idea of granular computing can be traced to Lotfi Zadeh [21]; for the latest developments, see, e.g., [16], [17].

There are many possible types of granules. For example, instead of several numerical values:
• we can consider intervals that contain all – or at least most – of the data points; see, e.g., [5], [9], [12];
• we can consider fuzzy sets, which describe not only which values are possible, but also to what degree different data points are possible; see, e.g., [1], [6], [10], [14], [20];
• we can consider type-2 fuzzy or probabilistic granules; see, e.g., [10], [11];
• we can consider rough sets, etc.

How to combine data points into a granule: towards the principle of justifiable granularity. Once we have selected a group of data points that we want to compress into a granule, the question is which granule to select based on these data points.
• If we try to include all data points in the granule, the resulting granule often becomes too wide and thus rather useless.
• On the other hand, if the granule is too narrow, it includes only a few of the corresponding points – and is, thus, also rather useless.
We thus need to achieve a trade-off between coverage and specificity; a simple numerical illustration of this trade-off is sketched below.
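To make the coverage-specificity trade-off concrete, here is a minimal numerical sketch (an illustration, not the algorithm from this paper): it builds an interval granule around the median of the data and picks the half-width ε that maximizes the product coverage(ε) · specificity(ε), using an exponential specificity measure of the kind the abstract reports as empirically most adequate. The function names and the decay parameter alpha are assumptions made for this example.

```python
import numpy as np

def coverage(points, center, eps):
    """Proportion of data points that fall inside the interval [center - eps, center + eps]."""
    return float(np.mean(np.abs(points - center) <= eps))

def specificity(eps, alpha=1.0):
    """Exponential specificity measure: narrower granules are more specific.
    (alpha is an illustrative decay rate, not a value taken from the paper.)"""
    return float(np.exp(-alpha * eps))

def justifiable_granule(points, alpha=1.0, num_candidates=200):
    """Choose the half-width eps that maximizes coverage * specificity --
    a simple instance of the coverage/specificity trade-off."""
    center = float(np.median(points))
    max_eps = float(np.max(np.abs(points - center)))
    candidates = np.linspace(0.0, max_eps, num_candidates)
    scores = [coverage(points, center, e) * specificity(e, alpha) for e in candidates]
    best_eps = float(candidates[int(np.argmax(scores))])
    return center - best_eps, center + best_eps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(loc=5.0, scale=2.0, size=1000)
    lo, hi = justifiable_granule(data)
    print(f"interval granule: [{lo:.2f}, {hi:.2f}]")
```

With a very wide interval the coverage factor approaches 1 but the specificity factor is close to 0; with a very narrow interval the opposite happens, so the product is maximized at an intermediate width.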
According to decision theory (see, e.g., [3], [4], [7], [13], [18]), the decisions of a rational decision maker can be described as optimizing the expected value of a special function – called the utility function u(s) – that describes the corresponding preferences. In other words, if after making a selection a, we get situations s1, ..., sn with probabilities p1(a), ..., pn(a), then we should make the selection for which the expected value

p1(a) · u(s1) + ... + pn(a) · u(sn)

attains its largest possible value. One can easily check that if we replace the utility function u(s) by a re-scaled one

u1(s) = k · u(s) + l, (1)

then we get the same order between selections. Vice versa, if two utility functions u(s) and u1(s) always lead to the same decisions, then these two functions are linearly related, i.e., there exist constants k > 0 and l for which the formula (1) holds for all situations s. In this sense, utility is similar to physical quantities like time or temperature, whose numerical values can change if we select:
• a different measuring unit and/or
• a different starting point.

In our case, when we replace several data points with a granule, we lose information, so the corresponding utility is negative.

In our problem, we have two situations. Some points we replace with a granule.
• The probability P of this replacement can be naturally computed as the proportion of data points that fit into the corresponding granule. This proportion depends on the size ε of the granule: P = P(ε); the larger the size, the higher the proportion.
• The utility of this replacement also depends on the size ε of the granule: u = u(ε); the larger the size, the smaller the utility.
Other points do not fit into the granule and are, thus, simply dismissed (or at least processed in a more complex way).
• The probability of this dismissal (or alternative processing) is, clearly, the remaining probability 1 − P(ε).
• Let us denote the utility of this dismissal (or alternative processing) by u0.
According to decision theory, we thus need to select the size ε that maximizes the expected utility

P(ε) · u(ε) + (1 − P(ε)) · u0.

This expression can be equivalently rewritten as

P(ε) · S(ε) + u0, (2)

where we denoted S(ε) := u(ε) − u0.
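As a sanity check on expression (2), the optimization can be carried out numerically. In the sketch below, the coverage model P(ε) = 1 − exp(−β·ε) and the gain S(ε) = u(ε) − u0 = exp(−α·ε) are illustrative assumptions rather than models taken from the paper; since the additive constant u0 in (2) does not affect where the maximum is attained, it suffices to maximize P(ε) · S(ε).

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative models (assumptions for this sketch, not taken from the paper):
# coverage P(eps) grows with the granule size; S(eps) = u(eps) - u0 decays with it.
def P(eps, beta=1.0):
    return 1.0 - np.exp(-beta * eps)

def S(eps, alpha=2.0):
    return np.exp(-alpha * eps)

def objective(eps):
    # By (2), maximizing P(eps) * S(eps) + u0 over eps is equivalent
    # to maximizing P(eps) * S(eps), since u0 is a constant.
    return P(eps) * S(eps)

result = minimize_scalar(lambda e: -objective(e), bounds=(1e-9, 10.0), method="bounded")
print(f"optimal granule size eps* = {result.x:.3f}")  # about ln(3/2) = 0.405 for these models
```

The interior maximum reflects the same trade-off as before: making ε larger increases the coverage factor P(ε) but decreases the specificity-like gain S(ε), so the product peaks at an intermediate granule size.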